
    Random deep neural networks are biased towards simple functions

    We prove that the binary classifiers of bit strings generated by random wide deep neural networks with the ReLU activation function are biased towards simple functions. The simplicity is captured by the following two properties. For any given input bit string, the average Hamming distance to the closest input bit string with a different classification is at least √(n / (2π log n)), where n is the length of the string. Moreover, if the bits of the initial string are flipped randomly, the average number of flips required to change the classification grows linearly with n. These results are confirmed by numerical experiments on deep neural networks with two hidden layers, and settle the conjecture that random deep neural networks are biased towards simple functions. This conjecture was proposed and numerically explored in [Valle Pérez et al., ICLR 2019] to explain the unreasonably good generalization properties of deep learning algorithms. The probability distribution of the functions generated by random deep neural networks is a good choice for the prior probability distribution in the PAC-Bayesian generalization bounds. Our results constitute a fundamental step forward in the characterization of this distribution, thereby contributing to the understanding of the generalization properties of deep learning algorithms.
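    As a rough illustration of the second property, the following sketch (a simplification, not the paper's exact experimental setup: the input length, layer widths, initialization scale, and flip protocol are all illustrative assumptions) measures how many random bit flips are needed to change the classification of a random two-hidden-layer ReLU network.

```python
import numpy as np

rng = np.random.default_rng(0)
n, width, trials = 64, 512, 200          # illustrative sizes, not the paper's

def random_relu_classifier(n, width):
    # Random two-hidden-layer ReLU network with 1/sqrt(fan-in) scaling.
    W1 = rng.normal(size=(width, n)) / np.sqrt(n)
    W2 = rng.normal(size=(width, width)) / np.sqrt(width)
    w3 = rng.normal(size=width) / np.sqrt(width)
    def classify(x):
        h = np.maximum(W1 @ x, 0.0)
        h = np.maximum(W2 @ h, 0.0)
        return np.sign(w3 @ h)           # binary label
    return classify

classify = random_relu_classifier(n, width)
flips_needed = []
for _ in range(trials):
    x = rng.integers(0, 2, size=n).astype(float)
    label, y = classify(x), x.copy()
    for k, i in enumerate(rng.permutation(n), start=1):
        y[i] = 1.0 - y[i]                # flip one bit at a time, in random order
        if classify(y) != label:
            flips_needed.append(k)
            break

print(f"mean flips to change the classification: {np.mean(flips_needed):.1f} of n={n}")
```

    If the paper's result holds, the printed mean should scale roughly linearly as n is increased.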

    Quantum Earth Mover's Distance: A New Approach to Learning Quantum Data

    Quantifying how far the output of a learning algorithm is from its target is an essential task in machine learning. However, in quantum settings, the loss landscapes of commonly used distance metrics often produce undesirable outcomes such as poor local minima and exponentially decaying gradients. As a new approach, we consider here the quantum earth mover's (EM) or Wasserstein-1 distance, recently proposed in [De Palma et al., arXiv:2009.04469] as a quantum analog to the classical EM distance. We show that the quantum EM distance possesses unique properties, not found in other commonly used quantum distance metrics, that make quantum learning more stable and efficient. We propose a quantum Wasserstein generative adversarial network (qWGAN) which takes advantage of the quantum EM distance and provides an efficient means of performing learning on quantum data. Our qWGAN requires resources polynomial in the number of qubits, and our numerical experiments demonstrate that it is capable of learning a diverse set of quantum data.
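    For background, the classical EM distance that the quantum version generalizes can be computed directly; this minimal sketch (using scipy's one-dimensional Wasserstein-1 routine, purely as a classical analogy rather than anything quantum) shows the distance recovering the shift between two Gaussians.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
p = rng.normal(loc=0.0, scale=1.0, size=10_000)   # samples from N(0, 1)
q = rng.normal(loc=0.5, scale=1.0, size=10_000)   # samples from N(0.5, 1)

# For 1-D distributions, W1 is the area between the two CDFs; shifting a
# unit Gaussian by 0.5 gives W1 = 0.5 exactly, so the estimate should be close.
print(f"W1 estimate: {wasserstein_distance(p, q):.3f}")
```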

    Quantum artificial intelligence: learning unitary transformations

    Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, May 2020. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 77-83).

    Linear algebra is a simple yet elegant framework that serves as the mathematical bedrock for many scientific and engineering disciplines. Broadly defined as the study of linear equations represented as vectors and matrices, linear algebra provides a mathematical toolbox for manipulating and controlling many physical systems. For example, linear algebra is central to the modeling of quantum mechanical phenomena and machine learning algorithms. Within the broad landscape of matrices studied in linear algebra, unitary matrices stand apart for their special properties, namely that they preserve norms and have easy-to-calculate inverses. Interpreted from an algorithmic or control setting, unitary matrices are used to describe and manipulate many physical systems.

    Relevant to the current work, unitary matrices are commonly studied in quantum mechanics, where they formulate the time evolution of quantum states, and in artificial intelligence, where they provide a means to construct stable learning algorithms by preserving norms. One natural question that arises when studying unitary matrices is how difficult it is to learn them. Such a question may arise, for example, when one would like to learn the dynamics of a quantum system or apply unitary transformations to data embedded into a machine learning algorithm. In this thesis, I examine the hardness of learning unitary matrices both in the context of deep learning and quantum computation. This work aims both to advance our general mathematical understanding of unitary matrices and to provide a framework for integrating unitary matrices into classical or quantum algorithms. Different forms of parameterizing unitary matrices, both in the quantum and classical regimes, are compared in this work.

    In general, experiments show that learning an arbitrary d×d unitary matrix requires at least d² parameters in the learning algorithm, regardless of the parameterization considered. In classical (non-quantum) settings, unitary matrices can be constructed by composing products of operators that act on smaller subspaces of the unitary manifold. In the quantum setting, there also exists the possibility of parameterizing unitary matrices in the Hamiltonian setting, where it is shown that repeatedly applying two alternating Hamiltonians is sufficient to learn an arbitrary unitary matrix. Finally, I discuss applications of this work in quantum and deep learning settings. For near-term quantum computers, applying a desired set of gates may not be efficiently possible. Instead, desired unitary matrices can be learned from a given set of available gates (similar to ideas discussed in quantum control).

    Understanding the learnability of unitary matrices can also aid efforts to integrate unitary matrices into neural networks and quantum deep learning algorithms. For example, deep learning algorithms implemented on quantum computers may leverage the parameterizations discussed here to form layers in a quantum learning architecture.

    by Bobak Toussi Kiani. S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering.
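    A minimal sketch of the alternating-Hamiltonian parameterization described above (not the thesis code: the Hamiltonians A and B, the depth, the fidelity-style loss, and the optimizer are illustrative assumptions) fits a product of exponentials of two fixed Hamiltonians to a Haar-random target unitary.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize
from scipy.stats import unitary_group

d, layers = 4, 8                          # 2*layers = 16 = d**2 real parameters
rng = np.random.default_rng(0)

def random_hamiltonian(d):
    M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (M + M.conj().T) / 2           # Hermitian matrix

A, B = random_hamiltonian(d), random_hamiltonian(d)
V = unitary_group.rvs(d, random_state=0)  # Haar-random target unitary

def circuit(params):
    # Alternating product: exp(i s_k B) exp(i t_k A), applied layer by layer.
    ts, ss = params[:layers], params[layers:]
    U = np.eye(d, dtype=complex)
    for t, s in zip(ts, ss):
        U = expm(1j * s * B) @ expm(1j * t * A) @ U
    return U

def loss(params):
    # 1 - |Tr(V^dag U)| / d is zero exactly when U equals V up to a global phase.
    return 1.0 - abs(np.trace(V.conj().T @ circuit(params))) / d

res = minimize(loss, rng.normal(size=2 * layers), method="BFGS")
print(f"final loss: {res.fun:.4f}")
```

    Consistent with the d² parameter count noted above, the sketch uses 2·layers = 16 real parameters for d = 4; with fewer alternations the loss would be expected to plateau above zero.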